ERROR: Could not find a version that satisfies the requirement tkinter (from versions: none)
ERROR: No matching distribution found for tkinter
import pandas as pd
from plotnine import *
from matplotlib.axis import Axis
import matplotlib.ticker as ticker
import matplotlib.pyplot as plt
import seaborn as sb
import numpy as np
from adjustText import adjust_text
from tkinter import *
from tkinter import font
from scipy import interpolate

q1 = pd.read_csv("/Users/minhtran/Desktop/GSB544/Lab 1/Data/q1data.csv")
q1
      income  life_exp  population  year  country      four_regions  six_regions               eight_regions       world_bank_region
  0   1910.0      61.0  29200000.0  2010  Afghanistan  asia          south_asia                asia_west           South Asia
  1  11100.0      78.1   2950000.0  2010  Albania      europe        europe_central_asia       europe_east         Europe & Central Asia
  2  11100.0      74.7  36000000.0  2010  Algeria      africa        middle_east_north_africa  africa_north        Middle East & North Africa
  3  46900.0      81.9     84500.0  2010  Andorra      europe        europe_central_asia       europe_west         Europe & Central Asia
  4   7680.0      60.8  23400000.0  2010  Angola       africa        sub_saharan_africa        africa_sub_saharan  Sub-Saharan Africa
...
192  20400.0      75.4  28400000.0  2010  Venezuela    americas      america                   america_south       Latin America & Caribbean
193   5350.0      73.3  88000000.0  2010  Vietnam      asia          east_asia_pacific         east_asia_pacific   East Asia & Pacific
194   4700.0      67.8  23200000.0  2010  Yemen        asia          middle_east_north_africa  asia_west           Middle East & North Africa
195   3200.0      57.5  13600000.0  2010  Zambia       africa        sub_saharan_africa        africa_sub_saharan  Sub-Saharan Africa
196   2560.0      54.4  12700000.0  2010  Zimbabwe     africa        sub_saharan_africa        africa_sub_saharan  Sub-Saharan Africa
197 rows × 9 columns
The aesthetics being used are:
Bubble Plot
X-axis: Income
Y-axis: Life Expectancy
Size: population
Color: continents (four_regions)
The axes are labeled “Income” (x-axis) and “Life Expectancy” (y-axis).
There are also annotations inside the graph: “at birth” on the y-axis and “GDP per capita” on the x-axis.
A watermark of the year 2010 sits behind the graph.
The x-axis uses a logarithmic scale.
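A logarithmic x-axis suits income because tick values that double at each step land at evenly spaced positions once their logarithm is taken. The sketch below checks this for the doubling tick values used in the plotting cell; `log_positions` is a hypothetical helper written only for this illustration.

```python
import math

def log_positions(ticks):
    """Return where each tick sits on a base-2 logarithmic axis."""
    return [math.log2(t) for t in ticks]

# Tick values that double at each step
x_ticks = [500, 1000, 2000, 4000, 8000, 16000, 32000, 64000, 128000]
positions = log_positions(x_ticks)

# Gap between each pair of neighbouring ticks on the log axis
gaps = [b - a for a, b in zip(positions, positions[1:])]
print([round(g, 6) for g in gaps])
```

Every gap comes out the same, which is why a country earning 1000 and one earning 2000 appear as far apart as countries earning 32000 and 64000.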
# Style the seaborn theme
sb.set_theme()
sb.set_style('whitegrid')
colors = ["#FF798F", "#7FEB00", "#00D5E9", "#FFE600"]
sb.set_palette(sb.color_palette(colors))

# Set the notebook's figure size
plt.rcParams['figure.figsize'] = [60, 40]

# Set up the bubble plot
q1_graph = sb.scatterplot(
    data=q1, x="income", y="life_exp", size="population", legend=False,
    alpha=0.9, hue="four_regions", edgecolor='black', linewidth=2,
    sizes=(500, 50000), zorder=3,
)

# Scale the x-axis (logarithmic) and the y-axis
x_ticks = [500, 1000, 2000, 4000, 8000, 16000, 32000, 64000, 128000]
x_labels = [str(x) for x in x_ticks]
q1_graph.set_xscale('log')
q1_graph.set_xticks(x_ticks)
q1_graph.set_xticklabels(x_labels, size=30)
y_ticks = [0, 20, 30, 40, 50, 60, 70, 80, 90]
y_labels = [str(y) for y in y_ticks]
q1_graph.set_yticks(y_ticks)
q1_graph.set_yticklabels(y_labels, size=30)

# Add the 2010 watermark and the income-level banner
q1_graph.text(0.5, 0.5, '2 0 1 0', transform=q1_graph.transAxes, fontsize=400, color='silver', alpha=0.5, ha='center', va='center', zorder=1)
q1_graph.text(0.5, 0.98, "INCOME LEVEL 1 LEVEL 2 LEVEL 3 LEVEL 4", transform=q1_graph.transAxes, fontsize=50, color='gray', alpha=0.5, ha='center', va='top')

# Add axis labels
plt.xlabel("Income", fontweight="bold", font="Sans Serif", color="darkslategray", fontsize=50, horizontalalignment='center')
plt.ylabel("Life Expectancy", fontweight="bold", font="Sans Serif", color="darkslategray", fontsize=50, verticalalignment='center', labelpad=30)

# Show the graph
plt.show()
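The `sizes=(500, 50000)` argument sets the range of bubble areas. This sketch assumes that a numeric `size` variable is rescaled linearly between its minimum and maximum and then mapped onto that range; it imitates the idea in plain Python rather than calling seaborn. `scale_sizes` is a hypothetical helper, and the populations are only the first five rows displayed in the table, not the full dataset.

```python
def scale_sizes(values, size_range=(500, 50000)):
    """Linearly map values onto a (smallest, largest) marker-area range."""
    lo, hi = min(values), max(values)
    smallest, largest = size_range
    if hi == lo:
        # All values equal: give every point the smallest area
        return [float(smallest) for _ in values]
    return [smallest + (v - lo) / (hi - lo) * (largest - smallest) for v in values]

# Populations of Afghanistan, Albania, Algeria, Andorra, Angola (rows 0-4)
populations = [29200000.0, 2950000.0, 36000000.0, 84500.0, 23400000.0]
areas = scale_sizes(populations)

for pop, area in zip(populations, areas):
    print(f"{pop:>12.0f} -> {area:>9.1f}")
```

Under this assumption the least populous country always gets the smallest bubble and the most populous the largest, regardless of the absolute numbers.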
# Define color palette
colors = ["#FF798F", "#7FEB00", "#00D5E9", "#FFE600"]

# Filter the data to exclude NaN values in the "four_regions" variable
q1_filtered = q1.dropna(subset=['four_regions'])

# Create the plotnine plots
plot1a = (
    ggplot(q1_filtered, aes(x="four_regions", y="income", fill="four_regions"))
    + geom_boxplot()
    + scale_fill_manual(values=colors)  # Set color palette
    + labs(x="Regions", y="Income")
    + scale_y_log10()  # Set y-axis scale to logarithmic
    + theme_minimal()  # Set theme to minimal
    + theme(figure_size=(25, 15))  # Adjust figure size
    + theme(text=element_text(size=20), axis_title=element_text(size=30, face="bold"), legend_title=element_text(size=20))  # Adjust text and axis label sizes
    + theme(panel_grid_major=element_line(color="gray", alpha=0.5))  # Add gridlines
    + labs(fill="Regions")
    + theme(legend_position=(0.75, 0.9))  # Move legend to the right
)

plot1b = (
    ggplot(q1_filtered, aes(x="four_regions", y="life_exp", fill="four_regions"))
    + geom_boxplot()
    + scale_fill_manual(values=colors)  # Set color palette
    + labs(x="Regions", y="Life Expectancy")
    + scale_y_log10()  # Set y-axis scale to logarithmic
    + theme_minimal()  # Set theme to minimal
    + theme(figure_size=(25, 15))  # Adjust figure size
    + theme(text=element_text(size=20), axis_title=element_text(size=30, face="bold"), legend_title=element_text(size=20))  # Adjust text and axis label sizes
    + theme(panel_grid_major=element_line(color="gray", alpha=0.5))  # Add gridlines
    + labs(fill="Regions")
    + theme(legend_position=(0.2, 0.9))  # Move legend to the left
)

plot1c = (
    ggplot(q1_filtered, aes(x="four_regions", y="population", fill="four_regions"))
    + geom_boxplot()
    + scale_fill_manual(values=colors)  # Set color palette
    + labs(x="Regions", y="Population")
    + scale_y_log10()  # Set y-axis scale to logarithmic
    + theme_minimal()  # Set theme to minimal
    + theme(figure_size=(25, 15))  # Adjust figure size
    + theme(text=element_text(size=20), axis_title=element_text(size=30, face="bold"), legend_title=element_text(size=20))  # Adjust text and axis label sizes
    + theme(panel_grid_major=element_line(color="gray", alpha=0.5))  # Add gridlines
    + labs(fill="Regions")
    + theme(legend_position=(0.2, 0.9))  # Move legend to the left
)

# Show the plots
print(plot1a)
print(plot1b)
print(plot1c)
These boxplots improve on the bubble plot in several ways.
Pros:
-Each region gets its own box, so you don’t have to count individual bubbles, which makes the plot easier to interpret.
-Outliers are drawn as separate points, so you don’t have to stretch the plot to include them as we did in the bubble plot.
-We can compare the distributions of income, population, and life expectancy across regions.
Cons:
-The relationships between the variables are not represented in one graph as they are in the bubble plot.
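To make the point about outliers concrete, here is a minimal sketch of the conventional boxplot rule, which flags values more than 1.5 × IQR beyond the quartiles. It is an illustration on raw values only: `box_stats` is a hypothetical helper, the incomes are just the ten rows displayed in the table (all regions mixed), and plotnine's own quantile method and the log scale used in the plots may give different fences.

```python
import statistics

def box_stats(values):
    """Quartiles, 1.5 * IQR fences, and the values that fall outside the fences."""
    first_q, median, third_q = statistics.quantiles(values, n=4, method="inclusive")
    iqr = third_q - first_q
    lower_fence = first_q - 1.5 * iqr
    upper_fence = third_q + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "first_q": first_q,
        "median": median,
        "third_q": third_q,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }

# Incomes from the ten rows shown in the table output (rows 0-4 and 192-196)
incomes = [1910.0, 11100.0, 11100.0, 46900.0, 7680.0,
           20400.0, 5350.0, 4700.0, 3200.0, 2560.0]
summary = box_stats(incomes)
print(summary)
```

In this small sample only Andorra's income (46900) lies beyond the upper fence, so a boxplot would draw it as a single point instead of stretching the box to reach it.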
This plot is not a good one because it fails to show the relationship between imports and exports. At best, it shows only exports and their frequency. It also fails to incorporate two other important variables: the continent (represented by color in the bubble plot) and energy (represented by bubble size in the bubble plot).
There are also the labels of “Individuals using the Internet” on the x-axis and “GDP/capita” on the y-axis
There is also a watermark of 2001 behind the graph
# Style the seaborn theme
sb.set_theme()
sb.set_style('whitegrid')
colors = ["#FF798F", "#7FEB00", "#00D5E9", "#FFE600"]
sb.set_palette(sb.color_palette(colors))

# Set the notebook's figure size
plt.rcParams['figure.figsize'] = (30, 20)

# Set up the bubble plot
q3_graph = sb.scatterplot(
    data=q3, x="internet_users", y="gdp", size="income", legend=False,
    alpha=0.9, hue="four_regions", edgecolor="black", linewidth=2,
    sizes=(60, 6000), zorder=3,
)

# Add the 2001 watermark
q3_graph.text(0.5, 0.5, '2 0 0 1', transform=q3_graph.transAxes, fontsize=450, color='silver', alpha=0.5, ha='center', va='center', zorder=1)

# Set ticks and tick labels for the x-axis and the log-scaled y-axis
x_ticks = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
x_labels = [str(x) for x in x_ticks]
q3_graph.set_xticks(x_ticks)
q3_graph.set_xticklabels(x_labels, size=30)
y_ticks = [200, 500, 1000, 2000, 5000, 10000, 20000, 50000, 100000]
y_labels = [str(y) for y in y_ticks]
q3_graph.set_yscale('log')
q3_graph.set_yticks(y_ticks)
q3_graph.set_yticklabels(y_labels, size=30)

# Add axis labels
plt.xlabel("Individuals using the Internet", fontweight="bold", font="Sans Serif", color="darkslategray", fontsize=35, horizontalalignment='center', labelpad=20)
plt.ylabel("GDP/capita", fontweight="bold", font="Sans Serif", color="darkslategray", fontsize=35, verticalalignment='center', labelpad=40)
Compared to the bubble plot, this plot is ineffective because it fails to show the relationship between individuals using the internet and GDP for each of the regions.